Second order bounds for Markov Decision Processes
Similar resources
Bounds for Markov Decision Processes
We consider the problem of producing lower bounds on the optimal cost-to-go function of a Markov decision problem. We present two approaches to this problem: one based on the methodology of approximate linear programming (ALP) and another based on the so-called martingale duality approach. We show that these two approaches are intimately connected. Exploring this connection leads us to the prob...
First-Order Markov Decision Processes
Markov Decision Processes (MDPs) [7] have developed lately as a standard method for representing uncertainty in decision-theoretic planning. Traditional MDP solution techniques have the drawback that they require an explicit state space, limiting their applicability to real-world problems due to the large number of world states occurring in such problems. Recent work addresses this drawback via...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
ON THE INFINITE ORDER MARKOV PROCESSES
The notion of infinite order Markov process is introduced and the Markov property of the flow of information is established.
Loss Bounds for Uncertain Transition Probabilities in Markov Decision Processes
We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are pre-computed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Our approach analyzes the growth of errors incurred by stepping backwa...
Journal
Journal title: Journal of Mathematical Analysis and Applications
Year: 1981
ISSN: 0022-247X
DOI: 10.1016/0022-247x(81)90106-2